ABSTRACT

Online surveys and customer reviews on shopping sites are key sources for understanding customer requirements and feedback, which help improve product quality and achieve better outcomes. In our previous paper, we proposed a novel approach to extract customer sentiments or opinions at a considerably finer level of granularity. However, there are many challenges in dealing with human languages. We concentrated only on reviews written in English and on reviews that are fairly straightforward. In the real world, customers express emotions such as anger and are sometimes sarcastic. In addition to these challenges, we need to deal with different words that express the same context. In this paper we extend that work to accept reviews in other languages and to address the problem of unknown words by making our system more adaptive.

INTRODUCTION

As in the previous paper, we use a dictionary to obtain the meanings of English words. In addition to using the dictionary, the current paper discusses updating the dictionary, obtaining the meaning of a word that is unknown to the system, understanding idioms, predicting sarcasm in the comments and, above all, detecting the language in use and translating the reviews into English before analysis.

To collect the reviews of a product, we use the Amazon, Flipkart and Twitter APIs, configured with Flume (a tool used along with Hadoop). Once the reviews have flowed into our HDFS, the language of each review is identified during processing, the review is translated into English, and the algorithm for extracting the user opinion is then applied.
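
The language-identification step can be approximated, as a minimal sketch, by looking at the Unicode script of the letters in a review. The function name and the script-to-language mapping below are illustrative assumptions, not the system's actual implementation. A script check cannot separate languages that share a script (for example Hindi and Marathi), so a real pipeline would need a proper language-identification model followed by a translation step.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping from a Unicode script keyword to a language label.
SCRIPT_TO_LANGUAGE = {
    "LATIN": "English",
    "DEVANAGARI": "Hindi",
    "TELUGU": "Telugu",
    "TAMIL": "Tamil",
}


def guess_language(review: str) -> str:
    """Guess the review language from the dominant script of its letters."""
    counts = Counter()
    for ch in review:
        if not ch.isalpha():
            continue  # skip digits, punctuation, emoji, vowel signs and spaces
        # Unicode character names start with the script name,
        # e.g. "DEVANAGARI LETTER KA" or "LATIN SMALL LETTER A".
        script = unicodedata.name(ch, "UNKNOWN").split(" ")[0]
        counts[script] += 1
    if not counts:
        return "unknown"
    dominant, _ = counts.most_common(1)[0]
    return SCRIPT_TO_LANGUAGE.get(dominant, "unknown")


for text in ["The camera is good", "कैमरा अच्छा है", "1234 !!"]:
    print(repr(text), "->", guess_language(text))
```

Reviews labelled anything other than "English" would then be routed to translation before opinion extraction.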

In statistical analyses, ratings are computed from structured and formal reviews. Informal and unstructured reviews are mostly left out, because of the lack of support for regional languages and for non-standard review content such as emojis and stickers. When ratings have to account for this kind of review, users are not in a position to judge whether a product is of good quality. Some stakeholders also find it difficult to justify the quality attributes of their products.

PROPOSED APPROACH

Considering all the constraints in the generation of product reviews, we propose a novel approach that handles both formal and informal, structured and unstructured reviews, so that qualitative ratings can be obtained for products in all aspects. In this paper we concentrate mainly on informal reviews written in regional languages.

STRUCTURED APPROACH

For processing regional-language reviews we can use the Natural Language Toolkit (NLTK) in Python. Natural Language Processing with Python, the book that accompanies NLTK, provides a practical introduction to programming for language processing. The following lines of code outline the actual processing.

>>> import nltk
>>> sentence = """At seven o'clock on Monday morning
... Arthur didn't feel good."""
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'seven', "o'clock", 'on', 'Monday', 'morning',
'Arthur', 'did', "n't", 'feel', 'good', '.']
>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('seven', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Monday', 'NNP'), ('morning', 'NN')]

Identify Named Entities

>>> entities = nltk.chunk.ne_chunk(tagged)
>>> entities
Tree('S', [('At', 'IN'), ('seven', 'CD'), ("o'clock", 'JJ'),
           ('on', 'IN'), ('Monday', 'NNP'), ('morning', 'NN'),
       Tree('PERSON', [('Arthur', 'NNP')]),
           ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'),
           ('good', 'JJ'), ('.', '.')])

Display a parse tree:

>>> from nltk.corpus import treebank
>>> t = treebank.parsed_sents('wsj_0001.mrg')[0]
>>> t.draw()


Impact  Factor  (JCC):  7.1293 


NAAS  Rating:  3.76 


An  Approach  to  Construct  MDRT  to  Produce 
Qualitative  Ratings  for  E-Commerce  Websites 




Figure 1: Parse tree displayed for the first sentence of wsj_0001.mrg

ANUSAARAKA 

Anusaaraka is an English-Hindi language accessing software. It is a machine translation tool that draws on insights from Panini's Ashtadhyayi (a system of grammar rules for Sanskrit).

Anusaaraka derives its name from the Sanskrit word 'Anusaran', which means 'to heed'. With the aim of reducing language barriers, Anusaaraka lets a user import text written in a language the user does not know. The present version of Anusaaraka provides translation from English to Hindi.

On the Anusaaraka site you can learn more about grammar conversion and, if you have a comfortable knowledge of English and Hindi, contribute suggestions to the language resource development of this software.

After the informal reviews have been converted into formal reviews, the next phase of our work starts, i.e., mining the data using any one of the best-practice mining tools.

DEPENDENCY PARSING

Dependency parsing is a technique used to identify the key terms in a review or a paragraph. For dependency parsing, the identification of key terms is carried out in three steps:

• Syntactic Representation

• Parsing Algorithm

• Machine Learning

Syntactic Representation

A syntactic structure consists of lexical items linked by binary asymmetric relations called dependencies. In a syntactic representation, each sentence is an organized whole whose constituent elements are words; every word that belongs to a sentence ceases by itself to be isolated as in the dictionary.

The structural connections establish dependency relations between the words. Each connection in principle unites a superior term and an inferior term.

Example for Syntactic Representation

Economic news had little effect on financial markets

In the above sentence, the verb "had" acts as the superior term, and the noun "news", the subject that depends on it, acts as the inferior term.
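
As a rough illustration, the dependency structure of the example sentence can be stored as a list of (superior term, relation, inferior term) triples. The arcs below follow the conventional dependency analysis of this sentence, in which the verb "had" is the root and the labels are sbj, obj, nmod and pc. The variable and function names are illustrative assumptions.

```python
from collections import defaultdict

# Each arc is (superior term / head, relation label, inferior term / dependent).
ARCS = [
    ("had", "sbj", "news"),
    ("news", "nmod", "Economic"),
    ("had", "obj", "effect"),
    ("effect", "nmod", "little"),
    ("effect", "nmod", "on"),
    ("on", "pc", "markets"),
    ("markets", "nmod", "financial"),
]


def dependents_by_head(arcs):
    """Group the inferior terms under each superior term."""
    table = defaultdict(list)
    for head, label, dependent in arcs:
        table[head].append((label, dependent))
    return dict(table)


def find_root(arcs):
    """Return the single word that is never an inferior term."""
    heads = {head for head, _, _ in arcs}
    dependents = {dependent for _, _, dependent in arcs}
    roots = heads - dependents
    if len(roots) != 1:
        raise ValueError("expected exactly one root, found %d" % len(roots))
    return roots.pop()


print("root:", find_root(ARCS))
for head, deps in dependents_by_head(ARCS).items():
    print(head, "->", deps)
```

Because every relation is binary and asymmetric, each word other than the root appears exactly once as an inferior term, which is what `find_root` relies on.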


[Figure: dependency arcs for the sentence above, added one at a time, with the labels nmod, sbj, obj, pc and p]

Parsing Algorithm

Training data: $\mathcal{T} = \{(sent_t, deps_t)\}_{t=1}^{T}$
$w^{(0)} = 0;\ v = 0;\ i = 0$
for $n : 1..N$
    for $t : 1..T$
        $w^{(i+1)}$ = update $w^{(i)}$ according to $(sent_t, deps_t)$
        $v = v + w^{(i+1)}$
        $i = i + 1$
$w = v / (N \cdot T)$
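
The averaging loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the pseudocode leaves the update rule unspecified, so a plain perceptron update on toy binary-classification data stands in for it here, only to make the loop runnable. The names `perceptron_update`, `train_averaged` and `predict`, and the toy data, are illustrative and not part of the original system.

```python
def perceptron_update(w, x, y):
    """Stand-in update rule: move w toward y * x when (x, y) is misclassified."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    if y * score <= 0:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return list(w)


def train_averaged(data, dim, epochs, update=perceptron_update):
    """Run the online loop and return the average of all intermediate weights."""
    w = [0.0] * dim  # w^(0) = 0
    v = [0.0] * dim  # running sum of every intermediate weight vector
    for _ in range(epochs):        # n = 1..N
        for x, y in data:          # t = 1..T
            w = update(w, x, y)                    # w^(i+1)
            v = [vi + wi for vi, wi in zip(v, w)]  # v = v + w^(i+1)
    return [vi / (epochs * len(data)) for vi in v]  # w = v / (N * T)


def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1


# Toy, linearly separable data: (feature vector, label in {+1, -1}).
toy = [([1.0, 2.0], 1), ([2.0, 1.0], 1), ([-1.0, -2.0], -1), ([-2.0, -1.0], -1)]
weights = train_averaged(toy, dim=2, epochs=3)
print(weights, [predict(weights, x) for x, _ in toy])
```

Averaging the intermediate weight vectors, rather than returning only the last one, is what the final line of the pseudocode does; it tends to make the result less sensitive to the order of the training examples.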

Statistical Evaluation of Parsing Algorithm:

Table 1

                        English           Czech
Parser                  W       S         W       S
k-best MIRA Eisner      90.9    37.5      83.3    31.3
best MIRA CLE           90.2    33.2      84.1    32.2
factored MIRA CLE       90.2    32.2      84.4    32.3

Work Approach

For a sample product, we took a few reviews as a reference and converted them into a Multi-Dimensional Review Table (MDRT). To obtain the reviews from the e-commerce site Amazon, we used an API integration method, which returns the reviews in JSON format.

For example, we considered 5 sample reviews for a mobile phone and generated the following sample table for one review.

Table 2

Quality Parameters    Good    Average    Bad
Processor             5       0          0
Camera                0       3          0
RAM                   5       0          0
Sensors               0       0          1
Battery               0       3          0
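
A minimal sketch of how one review in JSON form could be turned into MDRT rows like those in Table 2. The JSON layout (an `aspects` object with a 1-5 score per quality parameter) and the thresholds that map a score to Good, Average or Bad are assumptions made for illustration; the source does not specify the actual response format or the binning rule.

```python
import json

COLUMNS = ("Good", "Average", "Bad")


def bucket(score: int) -> str:
    """Map a 1-5 score to a column (assumed thresholds)."""
    if score >= 4:
        return "Good"
    if score >= 2:
        return "Average"
    return "Bad"


def build_mdrt(review_json: str) -> dict:
    """Build one row per quality parameter, with the score in its column."""
    review = json.loads(review_json)
    table = {}
    for aspect, score in review["aspects"].items():
        row = {column: 0 for column in COLUMNS}
        row[bucket(score)] = score
        table[aspect] = row
    return table


# Hypothetical review payload; the field names are not from a real API.
sample = json.dumps(
    {"aspects": {"Processor": 5, "Camera": 3, "RAM": 5, "Sensors": 1, "Battery": 3}}
)
mdrt = build_mdrt(sample)
for aspect, row in mdrt.items():
    print(f"{aspect:<12}" + "".join(f"{row[column]:>9}" for column in COLUMNS))
```

Tables built this way for several reviews could then be combined per quality parameter to produce an overall rating.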

CONCLUSIONS 

After generating all the key terms, or indexed terms, in the form of superior and inferior words, we generate a resultant table that shows the efficiency of the products on e-commerce websites. Once the table has been generated, we project the qualitative ratings of a product in the form of a statistical representation.